
LSTM: A Search Space Odyssey


Abstract

Several variants of the Long Short-Term Memory (LSTM) architecture for recurrent neural networks have been proposed since its inception in 1995. In recent years, these networks have become the state-of-the-art models for a variety of machine learning problems. This has led to a renewed interest in understanding the role and utility of various computational components of typical LSTM variants. In this paper, we present the first large-scale analysis of eight LSTM variants on three representative tasks: speech recognition, handwriting recognition, and polyphonic music modeling. The hyperparameters of all LSTM variants for each task were optimized separately using random search, and their importance was assessed using the powerful fANOVA framework. In total, we summarize the results of 5400 experimental runs ($\approx 15$ years of CPU time), which makes our study the largest of its kind on LSTM networks. Our results show that none of the variants can improve upon the standard LSTM architecture significantly, and demonstrate the forget gate and the output activation function to be its most critical components. We further observe that the studied hyperparameters are virtually independent and derive guidelines for their efficient adjustment.
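To make the two components named in the abstract concrete, the following is a minimal NumPy sketch of a single LSTM time step in which the forget gate and the output activation function can each be switched off. It is an illustration under stated assumptions, not the paper's implementation: the function name `lstm_step`, the flag names, the stacked weight layout, and the omission of peephole connections are all choices made here for brevity.

```python
import numpy as np


def sigmoid(x):
    """Logistic sigmoid used for the gates."""
    return 1.0 / (1.0 + np.exp(-x))


def lstm_step(x, h_prev, c_prev, W, R, b,
              use_forget_gate=True, use_output_activation=True):
    """One LSTM time step without peephole connections (assumed layout).

    x: input vector of size D
    h_prev, c_prev: previous hidden and cell state, each of size H
    W: (4H, D) input weights, R: (4H, H) recurrent weights, b: (4H,) biases
    Rows are stacked in the order: block input, input gate,
    forget gate, output gate.
    """
    H = h_prev.shape[0]
    pre = W @ x + R @ h_prev + b

    z = np.tanh(pre[0:H])              # block input
    i = sigmoid(pre[H:2 * H])          # input gate
    if use_forget_gate:
        f = sigmoid(pre[2 * H:3 * H])  # forget gate
    else:
        f = np.ones(H)                 # ablation: cell state is never decayed
    o = sigmoid(pre[3 * H:4 * H])      # output gate

    c = f * c_prev + i * z             # new cell state
    # Output activation: squash the cell state with tanh, or pass it through.
    h = o * (np.tanh(c) if use_output_activation else c)
    return h, c
```

Calling `lstm_step` with `use_forget_gate=False` or `use_output_activation=False` gives a rough analogue of removing the corresponding component, which is the kind of single-change comparison the abstract describes.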
